AITopics

Country: Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Energy > Power Industry (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.41)

Neural Information Processing SystemsFeb-16-2026, 14:41:36 GMT

Exploring and Interacting with the Set of Good Sparse Generalized Additive Models

The Rashomon set is the set of models that are approximately as good as the best model in the class. That is, it is the set of all good models.

artificial intelligence, data mining, machine learning, (17 more...)

Country:

North America > United States (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Neural Information Processing SystemsFeb-15-2026, 18:54:53 GMT

Learning Cut Generating Functions for Integer Programming

CGF can outperform the GMI cuts for certain distributions.

artificial intelligence, machine learning, optimization problem, (17 more...)

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California (0.04)
Europe > Belgium > Wallonia > Liège Province > Liège (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Tabib, H. M. Shadman, Deedar, Jaber Ahmed

Toward Trustworthy Difficulty Assessments: Large Language Models as Judges in Programming and Synthetic Tasks

arXiv.org Artificial IntelligenceNov-25-2025

Large Language Models (LLMs) have demonstrated impressive capabilities in natural language and code generation, and are increasingly deployed as automatic judges of model outputs and learning activities. Yet, their behavior on structured tasks such as predicting the difficulty of competitive programming problems remains under-explored. We conduct a systematic comparison of GPT-4o, used purely as a natural-language difficulty assessor, against an interpretable Light-GBM ensemble trained on explicit numeric and textual features. On a dataset of 1,825 LeetCode problems labeled Easy, Medium, or Hard, LightGBM attains 86% accuracy, whereas GPT-4o reaches only 37.75%. Detailed analyses, including confusion matrices and SHAP-based interpretability, show that numeric constraints -- such as input size limits and acceptance rates -- play a crucial role in separating Hard problems from easier ones. By contrast, GPT-4o often overlooks these cues and exhibits a strong bias toward simpler categories. We further probe GPT-4o through a synthetic Hard-problem generation protocol. Surprisingly, GPT-4o labels almost all of its own synthetic Hard problems as Medium, contradicting its tendency to downgrade real Hard problems to Easy. Our findings connect to recent work on LLMs-as-judges and automatic difficulty estimation in programming and education, and highlight concrete failure modes that must be addressed before LLM-based judges can be considered trustworthy in competitive programming, educational platforms, or reinforcement-learning pipelines.

large language model, machine learning, natural language, (19 more...)

2511.18597

Genre: Research Report > New Finding (0.67)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-10-2025, 18:54:08 GMT

GLinSA T: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent

Constrained decision-making problems are pervasive across various disciplines.

constraint, experiment, satisfiability layer, (16 more...)

Country: Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Energy > Power Industry (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.41)

Neural Information Processing SystemsOct-10-2025, 05:45:55 GMT

71008846945765893f43abe829090bf8-Paper-Conference.pdf

cut generating function, plane, tree size, (14 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Neural Information Processing SystemsOct-9-2025, 05:06:07 GMT

b1719f44953c2e0754a016ab267fe4e7-Paper-Conference.pdf

artificial intelligence, data mining, machine learning, (17 more...)

Country:

North America > United States (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining (0.68)

arXiv.org Artificial IntelligenceOct-7-2025

Refactoring Codebases through Library Design

Kovacic, Ziga, Chiu, Justin T., Lee, Celine, Zhao, Wenting, Ellis, Kevin

Maintainable and general software allows developers to build robust applications efficiently, yet achieving these qualities often requires refactoring specialized solutions into reusable components. This challenge becomes particularly relevant as code agents become used to solve isolated one-off programming problems. We investigate code agents' capacity to refactor code in ways that support growth and reusability. We first investigate what makes a good refactoring, finding via simulation results and a human study that Minimum Description Length best correlates with preferable refactorings. We then present both a benchmark and a method for refactoring: MiniCode, a benchmark where multiple files must be refactored into a shared library, and Librarian, a sample-and-rerank method for generating reusable libraries. We compare Librarian to state-of-the-art library generation methods, and study it on real-world code bases.

large language model, machine learning, programming language, (23 more...)

2506.11058

Genre: Research Report (0.82)

Industry: Education (0.47)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

arXiv.org Artificial IntelligenceSep-30-2025

Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization

Xiong, Boya, Wang, Shuo, Ge, Weifeng, Chen, Guanhua, Chen, Yun

Fine-tuning is a crucial process for adapting large language models (LLMs) to diverse applications. In certain scenarios, like multi-tenant serving, a large number of LLMs finetuned from the same base model are deployed to meet complex requirements for users. Recent works explore delta-compression approaches to quantize and compress the delta weights between the customized LLM and the corresponding base model. However, they exhibit inadequate performance at high compression ratios due to their empirical nature. In this work, we introduce DeltaMix, an adaptive mixed-precision delta-compression framework designed to minimize quantization error in the singular value decomposition (SVD) space without imposing additional assumptions. DeltaMix provides a theoretical justification for the necessity of mixed-precision compression and presents a practical quantization solution that involves solving a 0/1 linear integer programming problem alongside a reconstruction target correction method. Experimental results across multiple models and benchmarks illustrate that DeltaMix consistently outperforms all baseline methods. Notably, on tasks such as AIME2024 and GQA, DeltaMix exceeds the performance of the best baseline, Delta-CoMe, by 22.3\% and 6.1\% for 7B parameter models, respectively.

large language model, machine learning, natural language, (20 more...)

2506.11087

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Dascalescu, Stefan, Dumitran, Adrian Marius, Vasiluta, Mihai Alexandru

Leveraging Generative AI for Enhancing Automated Assessment in Programming Education Contests

arXiv.org Artificial IntelligenceSep-30-2025

Competitive programming contests play a crucial role in cultivating computational thinking and algorithmic skills among learners. However, generating comprehensive test cases to effectively assess programming solutions remains resource-intensive and challenging for educators. This paper introduces an innovative NLP-driven method leveraging generative AI (large language models) to automate the creation of high-quality test cases for competitive programming assessments. We extensively evaluated our approach on diverse datasets, including 25 years of Romanian Informatics Olympiad (OJI) data for 5th graders, recent competitions hosted on the Kilonova.ro platform, and the International Informatics Olympiad in Teams (IIOT). Our results demonstrate that AI-generated test cases substantially enhanced assessments, notably identifying previously undetected errors in 67% of the OJI 5th grade programming problems. These improvements underscore the complementary educational value of our technique in formative assessment contexts. By openly sharing our prompts, translated datasets, and methodologies, we offer practical NLP-based tools that educators and contest organizers can readily integrate to enhance assessment quality, reduce workload, and deepen insights into learner performance.

large language model, machine learning, natural language, (19 more...)

2506.0599

Country: Europe > Romania (0.30)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Assessment & Standards (1.00)
Education > Educational Setting > K-12 Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.72)